我们解决了现实世界用户生成的离散事件序列上的自我监督学习问题。自我监督的学习将来自原始数据的复杂信息包含在低维固定长度矢量表示中,这些信息可以轻松地应用于各种下游机器学习任务中。在本文中,我们提出了一种新方法“ COLES”,该方法将以前用于音频和计算机视觉域的对比度学习适应自我监督的设置中的离散事件序列域。我们根据大型欧洲金融服务公司的交易序列部署了COLES嵌入。 COLES嵌入的用法显着提高了预先存在的模型在下游任务上的性能,并产生了巨大的财务收益,每年以数亿美元的价格衡量。我们还在几个公共事件序列数据集上评估了COLES,并表明COLES表示在不同的下游任务上始终超过其他方法。
translated by 谷歌翻译
The monograph summarizes and analyzes the current state of development of computer and mathematical simulation and modeling, the automation of management processes, the use of information technologies in education, the design of information systems and software complexes, the development of computer telecommunication networks and technologies most areas that are united by the term Industry 4.0
translated by 谷歌翻译
Recent work leverages the expressive power of generative adversarial networks (GANs) to generate labeled synthetic datasets. These dataset generation methods often require new annotations of synthetic images, which forces practitioners to seek out annotators, curate a set of synthetic images, and ensure the quality of generated labels. We introduce the HandsOff framework, a technique capable of producing an unlimited number of synthetic images and corresponding labels after being trained on less than 50 pre-existing labeled images. Our framework avoids the practical drawbacks of prior work by unifying the field of GAN inversion with dataset generation. We generate datasets with rich pixel-wise labels in multiple challenging domains such as faces, cars, full-body human poses, and urban driving scenes. Our method achieves state-of-the-art performance in semantic segmentation, keypoint detection, and depth estimation compared to prior dataset generation approaches and transfer learning baselines. We additionally showcase its ability to address broad challenges in model development which stem from fixed, hand-annotated datasets, such as the long-tail problem in semantic segmentation.
translated by 谷歌翻译
Building systems that achieve a deeper understanding of language is one of the central goals of natural language processing (NLP). Towards this goal, recent works have begun to train language models on narrative datasets which require extracting the most critical information by integrating across long contexts. However, it is still an open question whether these models are learning a deeper understanding of the text, or if the models are simply learning a heuristic to complete the task. This work investigates this further by turning to the one language processing system that truly understands complex language: the human brain. We show that training language models for deeper narrative understanding results in richer representations that have improved alignment to human brain activity. We further find that the improvements in brain alignment are larger for character names than for other discourse features, which indicates that these models are learning important narrative elements. Taken together, these results suggest that this type of training can indeed lead to deeper language understanding. These findings have consequences both for cognitive neuroscience by revealing some of the significant factors behind brain-NLP alignment, and for NLP by highlighting that understanding of long-range context can be improved beyond language modeling.
translated by 谷歌翻译
Magnetic Resonance Fingerprinting (MRF) is an efficient quantitative MRI technique that can extract important tissue and system parameters such as T1, T2, B0, and B1 from a single scan. This property also makes it attractive for retrospectively synthesizing contrast-weighted images. In general, contrast-weighted images like T1-weighted, T2-weighted, etc., can be synthesized directly from parameter maps through spin-dynamics simulation (i.e., Bloch or Extended Phase Graph models). However, these approaches often exhibit artifacts due to imperfections in the mapping, the sequence modeling, and the data acquisition. Here we propose a supervised learning-based method that directly synthesizes contrast-weighted images from the MRF data without going through the quantitative mapping and spin-dynamics simulation. To implement our direct contrast synthesis (DCS) method, we deploy a conditional Generative Adversarial Network (GAN) framework and propose a multi-branch U-Net as the generator. The input MRF data are used to directly synthesize T1-weighted, T2-weighted, and fluid-attenuated inversion recovery (FLAIR) images through supervised training on paired MRF and target spin echo-based contrast-weighted scans. In-vivo experiments demonstrate excellent image quality compared to simulation-based contrast synthesis and previous DCS methods, both visually as well as by quantitative metrics. We also demonstrate cases where our trained model is able to mitigate in-flow and spiral off-resonance artifacts that are typically seen in MRF reconstructions and thus more faithfully represent conventional spin echo-based contrast-weighted images.
translated by 谷歌翻译
Language models have been shown to be very effective in predicting brain recordings of subjects experiencing complex language stimuli. For a deeper understanding of this alignment, it is important to understand the alignment between the detailed processing of linguistic information by the human brain versus language models. In NLP, linguistic probing tasks have revealed a hierarchy of information processing in neural language models that progresses from simple to complex with an increase in depth. On the other hand, in neuroscience, the strongest alignment with high-level language brain regions has consistently been observed in the middle layers. These findings leave an open question as to what linguistic information actually underlies the observed alignment between brains and language models. We investigate this question via a direct approach, in which we eliminate information related to specific linguistic properties in the language model representations and observe how this intervention affects the alignment with fMRI brain recordings obtained while participants listened to a story. We investigate a range of linguistic properties (surface, syntactic and semantic) and find that the elimination of each one results in a significant decrease in brain alignment across all layers of a language model. These findings provide direct evidence for the role of specific linguistic information in the alignment between brain and language models, and opens new avenues for mapping the joint information processing in both systems.
translated by 谷歌翻译
People constantly use language to learn about the world. Computational linguists have capitalized on this fact to build large language models (LLMs) that acquire co-occurrence-based knowledge from language corpora. LLMs achieve impressive performance on many tasks, but the robustness of their world knowledge has been questioned. Here, we ask: do LLMs acquire generalized knowledge about real-world events? Using curated sets of minimal sentence pairs (n=1215), we tested whether LLMs are more likely to generate plausible event descriptions compared to their implausible counterparts. We found that LLMs systematically distinguish possible and impossible events (The teacher bought the laptop vs. The laptop bought the teacher) but fall short of human performance when distinguishing likely and unlikely events (The nanny tutored the boy vs. The boy tutored the nanny). In follow-up analyses, we show that (i) LLM scores are driven by both plausibility and surface-level sentence features, (ii) LLMs generalize well across syntactic sentence variants (active vs passive) but less well across semantic sentence variants (synonymous sentences), (iii) some, but not all LLM deviations from ground-truth labels align with crowdsourced human judgments, and (iv) explicit event plausibility information emerges in middle LLM layers and remains high thereafter. Overall, our analyses reveal a gap in LLMs' event knowledge, highlighting their limitations as generalized knowledge bases. We conclude by speculating that the differential performance on impossible vs. unlikely events is not a temporary setback but an inherent property of LLMs, reflecting a fundamental difference between linguistic knowledge and world knowledge in intelligent systems.
translated by 谷歌翻译
Pretrained language models that have been trained to predict the next word over billions of text documents have been shown to also significantly predict brain recordings of people comprehending language. Understanding the reasons behind the observed similarities between language in machines and language in the brain can lead to more insight into both systems. Recent works suggest that the prediction of the next word is a key mechanism that contributes to the alignment between the two. What is not yet understood is whether prediction of the next word is necessary for this observed alignment or simply sufficient, and whether there are other shared mechanisms or information that is similarly important. In this work, we take a first step towards a better understanding via two simple perturbations in a popular pretrained language model. The first perturbation is to improve the model's ability to predict the next word in the specific naturalistic stimulus text that the brain recordings correspond to. We show that this indeed improves the alignment with the brain recordings. However, this improved alignment may also be due to any improved word-level or multi-word level semantics for the specific world that is described by the stimulus narrative. We aim to disentangle the contribution of next word prediction and semantic knowledge via our second perturbation: scrambling the word order at inference time, which reduces the ability to predict the next word, but maintains any newly learned word-level semantics. By comparing the alignment with brain recordings of these differently perturbed models, we show that improvements in alignment with brain recordings are due to more than improvements in next word prediction and word-level semantics.
translated by 谷歌翻译
Chromosome analysis is essential for diagnosing genetic disorders. For hematologic malignancies, identification of somatic clonal aberrations by karyotype analysis remains the standard of care. However, karyotyping is costly and time-consuming because of the largely manual process and the expertise required in identifying and annotating aberrations. Efforts to automate karyotype analysis to date fell short in aberration detection. Using a training set of ~10k patient specimens and ~50k karyograms from over 5 years from the Fred Hutchinson Cancer Center, we created a labeled set of images representing individual chromosomes. These individual chromosomes were used to train and assess deep learning models for classifying the 24 human chromosomes and identifying chromosomal aberrations. The top-accuracy models utilized the recently introduced Topological Vision Transformers (TopViTs) with 2-level-block-Toeplitz masking, to incorporate structural inductive bias. TopViT outperformed CNN (Inception) models with >99.3% accuracy for chromosome identification, and exhibited accuracies >99% for aberration detection in most aberrations. Notably, we were able to show high-quality performance even in "few shot" learning scenarios. Incorporating the definition of clonality substantially improved both precision and recall (sensitivity). When applied to "zero shot" scenarios, the model captured aberrations without training, with perfect precision at >50% recall. Together these results show that modern deep learning models can approach expert-level performance for chromosome aberration detection. To our knowledge, this is the first study demonstrating the downstream effectiveness of TopViTs. These results open up exciting opportunities for not only expediting patient results but providing a scalable technology for early screening of low-abundance chromosomal lesions.
translated by 谷歌翻译
电子邮件网络钓鱼变得越来越普遍,随着时间的流逝,网络钓鱼变得更加复杂。为了打击这一上升,已经开发了许多用于检测网络钓鱼电子邮件的机器学习(ML)算法。但是,由于这些算法训练的电子邮件数据集有限,因此它们不擅长识别各种攻击,因此遭受了概念漂移的困扰。攻击者可以在其电子邮件或网站的统计特征上引入小小的变化,以成功绕过检测。随着时间的流逝,文献所报告的准确性与算法在现实世界中的实际有效性之间存在差距。这以频繁的假阳性和假阴性分类意识到自己。为此,我们建议对电子邮件进行多维风险评估,以减少攻击者调整电子邮件并避免检测的可行性。这种横向发送网络钓鱼检测配置文件的水平方法在其主要功能上发出了传入的电子邮件。我们开发了一个风险评估框架,其中包括三个模型,分析了电子邮件(1)威胁级别,(2)认知操纵和(3)电子邮件类型,我们合并了这些电子邮件类型以返回最终的风险评估评分。剖面人员不需要大量的数据集进行训练以有效,其对电子邮件功能的分析会减少概念漂移的影响。我们的参考器可以与ML方法结合使用,以减少其错误分类或作为培训阶段中大型电子邮件数据集的标签。我们在9000个合法的数据集中,使用最先进的ML算法评估了剖面人员对机器学习合奏的功效,并从一个大型澳大利亚大型研究组织的900个网络钓鱼电子邮件中进行了效力。我们的结果表明,探查者的概念漂移的影响减少了30%的假阳性,对ML合奏方法的虚假负面电子邮件分类少25%。
translated by 谷歌翻译